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Abstract 

To treat the electronic structure of large molecules by electron propagator methods we devel- 
oped a parallel computer program called P— RICDS. The program exploits the sparsity of the 
two-electron integral matrix by using Cholesky decomposition techniques. The advantage of these 
techniques is that the error introduced is controlled only by one parameter which can be chosen as 
small as needed. We verify the tolerance of electron propagator methods to the Cholesky decompo- 
sition threshold and demonstrate the power of the P— RICDS program for a representative example 
(Ceo)- All decomposition schemes addressed in the literature are investigated. Even with moder- 
ate thresholds the maximal error encountered in the calculated electron affinities and ionization 
potentials amount to a few meV only, and the error becomes negligible for small thresholds. 
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I. I. INTRODUCTION 



In all advanced quantum chemistry calculations the evaluation of the huge number of 
two-electron repulsion integrals (ERIs) consistutes the major computational obstacle. It is 
thus not surpising that one searches for efficient methods to accurately approximate these 
integrals. Most known is the method of resolution of identity (RI), see, e.g. the reviews jl, 2| 
and references therein, which has been widely used to compute energies and other properties 
of molecules. A few decades ago Beebe and Linderberg have published an original paper 
about reducing computational efforts of ab initio methods by using Cholesky decomposition 
(CD) of the two-electron integral matrix 3)]. Within the CD method the accuracy of repre- 
sentation of the exact ERIs is controlled only by a single parameter, the so-called Cholesky 
decomposition threshold 5. By construction, it provides an upper bound to the absolute 
difference between an exact ERI and its approximated value. One of the key advantages of 
the CD approach is that the error introduced to the ERIs can be made as small as needed. 
Recently, the CD approach has been successfully combined with MP2 perturbation the- 
ory 14-61, with CASSCF configuration interaction methods 17-91 and with the CC2 linear 

nn 

response theory [10|, [TJJ • By using the CD approach one can speed up electron correlated 
calculations up to a few hundred times, thus enabling applications to large quantum systems 
(many atoms and many basis functions). Moreover, within the CD method one can perform 
correlated calculations of systems, which can not be attacked using ordinary conventional 
techniques. 

Our goal is to study the performance of CD in electron propagator calculations. We 
would like to mention that Flores-Moreno and Ortiz have investigated and applied the RI 
method in the context of electron propagator theory 12j. Electron propagator theory (EPT) 



has been proven to be a powerful tool for investigating the electronic structure of bound 



and unbound, metastable, states. Indeed, the 



of vertical ionization and attachment energies 13M20j , and more recently also of widths and 



is widely used for the direct calculation 



positions of short-lived electronic states (resonances) [2l|, |22] . In particular, methods have 
been formulated as a combination of the EPT jmd_a stabilization technique and successfully 
used for describing resonant states of anions 



23 



25|. 



Normally, the error of computed total energies (using standard methods like e.g. SCF, 
MP2, CASSCF, CASPT2) or response properties introduced by the CD approximation has 



2 



the same order of magnitude as S 



11 



HQ. In contrast, Beebe and Linderberg made 



the assumption that electron propagators should be tolerant to the Cholesky decomposition 
threshold: "... for propagator methods which directly compute energy differences, for an 
accuracy of 10~ 5 in the transformed two-electron integrals will be more than adequate for 
most purposes ..." js]. In the last few years substantial progress has been made in further 



optimization of the CD method, 
developed by F. Aqualiante et al. 



. n particular, a new, so-called atomic CD version has been 
28] . The main goal of our present work is to investigate the 
feasibility of using the CD approaches including the atomic version for electron-propagator 
methods. 



II. II. THEORY 



A. Factorization of two-electron repulsion integrals 



Let {x/j, (r) ; l</i<A} denote a set of real spatial basis functions and ${={pi (r) = 
Xix ( r ) Xv ( r ) ; I = (a*) z/ ), l<Ai<^<A, I<N(N + l)/2} denote the corresponding one-electron 
product density set. One of the basic quantities in quantum chemistry is a two-electron 
repulsion integral (ERI): 



X» (ri) Xv (ri) — Xx {r 3 ) X* (r 2 ) dT 1 dr 2 
ri2 



(Pi\V\pj) (1) 



where T\i — \v\ — r 2 |. 

Since ERI is a four-indexed quantity, both the number of distinct ERIs and the time 
required for their evaluation increase as iV 4 /8. O (N 4 ) scaling forms one of the basic ob 
stacles to ab initio calculations with a finite basis set. The Density Fitting (D' 



or Resolution of Identity (RI) 



35( and Cholesky Decomposition (CD) [3] methods 
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33| 



are very efficient approaches to reduce scaling. The key idea in all of these methods is to 
factorize an ERI into a product form of three- indexed quantities 36j: 

M 

(HAa) = 5>2^£- (2) 



K=l 



The scaling; now is O (N 3 ). This factorizat ion also significantly reduces I/O, memory de- 
mands and also reduces the scaling of the atomic orbital (AO) to molecular orbital (MO) 



transformation from 0(N 5 ) to 0(N 4 ) The last issue is crucial for post-HF methods. 
Denote the MO expansion coefficients matrix by C, then an ERI in MO representation can 
be calculated by exploiting the same factorization (J2]): 

M 

ipq\rs) = B«B« , (3) 

K=l 

where p, q, r, s denote MO indices and B^ q : 

Bpq = C^ p A^ u C uq . (4) 

DF/RI and CD approaches vary in the generation of the intermediate A^ v . In order to 
demonstrate the interplay between DF/RI and CD techniques we would like to derive the 
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(see also |3J). Let 



given factorization p]) by Lowdin's inner projection technique 
us introduce the expansion basis set of functions d\={pi (r) ; 1<I<M} in the domain of the 
Coulomb operator V=r 12 1 and construct a projection operator P as follow: 

P = E VV2 |^)^(PI|V 1/2 , (5) 

K,L 

where is the i^Lth element of the inverse V -1 and Vkl = {pk\V\^l). The inner 
projection of the Coulomb operator V on P takes on the form 

V = V^pv 1 ^ = V\pZ)VkI(p£\V , (6) 

K,L 

and provides the following approximation to an ERI: 

(Pi\V\ P j) = (pi\V\ Pj ) + D U ^Y1 (Pi\ v \fi<) V kI {Tl\V\pj) , (7) 

K,L 

where 

Q<{pi\V-V\pj) = D IJ , (8) 

is the error introduced by the incompleteness of the expansion basis set. The double sum 
in the right hand side of Eq. ([7j) is the famous V-approximation which comes from the RI 



methodology |34(. Noting that since V is a positive definite operator and V is the lower 



bound to it, Djj satisfies the Schwarz inequality: 

D^DtfDfi . (9) 



The difference between CD and DF/RI methods are the generation of the expansion basis 
set £H or, in other words, they differ in the fitting of the original density product set 91. Here, 
there are two possibilities to construct expansion basis sets: using a native subset of the 
original product density set or using an external auxiliary basis set. The CD method forms 
an expansion basis set from the original £H (9t C 91). It exploits the linear dependence of the 



original product density set in the Coulomb metric |3j, [39|, |40]. Thus, the CD method is the 
ab initio density fitting by construction (by design). For constructing a linear independent 
subset of 91 it is convenient to apply the classical Gram-Schmidt (CGS) method using V as 
the weighting factor: 



Pk 



n 



-1/2 



K 



K-1 



PK-^PI (pi\V\p K ) 



1=1 



(10) 



where the normalization constant is given by 



K-1 



n K 



(Pk\V\pk) ~ (Pi\ V \PkY 



(11) 



i=i 



This recursive procedure is continued as long as the norm of the new vector pu & (at step 
Ms) is greater than the threshold for linear dependency 5, i.e. i%k > <5, or equivalently, 



D 



KK 



< 5 



411 ] . The threshold 5 is known as the Cholesky decomposition threshold and 



the resulting non-redundant subset is known as the Cholesky basis. As a rule of thumb, if 
5 = 10~ p , then [4^: 

pN < M s < (p + l)N. (12) 

It is interestingly to note that the same idea of using the CGS procedure for generating 
expansion basis set has been suggested by Whitten in the frame of the DF approach about 



30 years ago 



331 ] . By taking into account that (p/|^|pj) =Su and returning to the original 



set of pair of indices, the final expression of Eq. ([7]) in the Cholesky basis takes on the form: 



M 



M 



(pu\Xa) w (HPiO (pK\Xa) = L ^ L t , M = min(M s , (pu) ^ I, (Xa) ^ j) , (13) 

K=l K=l 

where the intermediate L^ u is the projection of the \pv) product on the i^th vector of The 
vector L K ={L^; l<p<v<N} is known as integral table. The number of ERIs needed for the 



generation of Ms integral tables is MsN 



Cholesky decomposition algorithm 4j, 
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'N + l)/2 (0(N 3 )). In actual practice a modified 
43| is used for calculating the intermediates L^ u 



instead of CGS because the latter is numerically unstable 



44] 



The Cholesky basis has three important features. First, it is optimal in the sense that it 
provides a rigorous upper bound for the value Du, i.e.: 

D u < 5 (14) 

An important point that if pi and pj have been involved in the CGS orthonormalization 



process, then the corresponding integral is represented exactly 



43|, i.e., 



D u = e, (15) 

where e is the machine epsilon. This means that M$ (Ms + 1) /2 integrals corresponding to 
;he product densities of 9^ will be generated exactly (within machine precision) by Eq. ( TT5j) 
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3|, |43] . Second, the Cholesky basis is continuous in the sense that by decreasing the threshold 
S the inner projection V smoothly becomes better and better and formally exact when 
asymptotically S— >0, i.e.: 

limV=V and HmDrj — e. (16) 

In the finite precision arithmetic this limit is reached with a Cholesky threshold less or equal 
to 10~ 10 (5 < 10~ 10 ). Thus, by using the CD approximation the error introduced to ERI can 
be made as small as desired. This features is a great advantage of the CD method. Third, 
the Cholesky basis set can be constructed on the fly for any arbitrary AO basis set, even for 
non-standard ones. Such a flexibility is important when calculating short-lived electronic 
states, where it is essential to extend an original AO basis set by adding external sets of 
basis functions for describing the continuum wavefunction 45-4^]. 

The disadvantage of the full-CD is the total price in the number of operations (multipli- 
cation, addition, square roots and array indexing is oc iV 4 ). By construction, the Cholesky 
basis generally contains both one-center and two-center one-electron densities which lead 
to the calculation of expensive four-center ERIs and, more important, to a dependence of 



the Cholesky basis on the molecular geometry [48]. To overcome these shortcomings a so- 
called atomic CD (aCD) modifications has been developed recently. In the aCD version the 
Cholesky basis contains only one-center product densities for each unique atom/AO basis 
set pair which reduces the computational effort considerably and leads to smooth potential 



energy surfaces 28j, |48[. The aCD (atomic Cholesky) basis set provides strict error control 
( IT4")) on the one-center and two-center "Coulomb" ERIs but three- and four-centers inte- 
grals and "Exchange" two-center integrals may be affected by large errors [49]. Unlike the 



full-CD approach, the accuracy of the aCD method cannot be improved beyond a certain 
limit, which is weakly dependent on the decomposition threshold. Nonetheless, recently it 
has been shown that the aCD approach does not downgrade the accuracy to any significant 



degree and the introduced error is slightly larger than that of the original full-CD |7J, |28|, 148 1. 

There is a further optimization of the aCD basis, the so-called atomic compact CD (acCD) 
basis set. The acCD is the result of removing the linear dependence among the primitive 



Gaussians |49|. As has been demonstrated recent 



aCD and acCD are very close to each other 



49 



y 



in a serie of papers, the accuracy of 



50| | . It is important to stress that the 



Cholesky basis sets are generated from first principles and are not biased to any quantum 



5l|- 



chemical method. For a detailed discussion on atomic CD we refer to the review 

In contrast to the above mentioned ab initio Cholesky basis sets, the DF/RI auxiliary 
basis sets have been optimized to reproduce accurately certain specific quantities. The key 
point in the development of auxiliary basis sets is to provide a balance between the accuracy 
of the computed quantities and the numerical effort required. The balanced auxiliary basis 
set should fulfil two requirements. First of all its size should be only a few times (3-4) 
larger than the size of the original AO basis set. Second, the error introduced due to the 
DF/RI approximation should be at least one order of magnitude smaller than the error 
resulting from one-electron basis set incompleteness. Technically, an auxiliary basis set is 
optimized by minimizing the deviation between the exact target quantity and that calculated 
via the DF/RI approximation in a set of atomic and molecular calculations for a certain 
combination of pair AO basis set and level of theory. One of the best approved family of 
auxiliary basis sets is RI-X (X=J,JK,C), where X refers to the theoretical method used in 



the parameterization 52^ 58 1 . 

In particular, the RI-JK family of auxiliary basis sets was developed to reproduce the HF 
coulomb and exchange energies: 

E S CF = Y,{2(ii\jj)-(ij\ij)) , (17) 
whereas RI-C was invented to reproduce the MP2 correlation energy: 

MP2c = ( 2 M 6 j) - («jN) • Hji) ^ 

T £7 &a 

0,6 

where a,b denote virtual, and i,j denote occupied HF orbitals. Later on Hattig et al. 



demonstrated that the RI-C auxiliary basis sets also suitable for calculating ground state 



and excitation energies at the CC2 level of theory [2j, |59|. The typical errors produced by 
RI-X auxiliary basis sets for a target quantity are about a fraction of few tenths or even few 
hundredth of mE^ per atom. However, in the case of an inappropriate combination of AO 
and RI-X basis sets and level of theory the resulting error can increase drastically and be a 



few orders of magnitude larger than suggested in the original papers [60|]. The other issue 
is the application of the DF/RI scheme in combination with augmented AO basis sets. For 
example, an augmented basis set can be obtained from the original AO set by adding some 
special basis functions. Potentially, the use of such combinations might lead to inconsistent 
results. This possibility is related to the fact that the standard published auxiliary basis 
sets have not been developed for fitting the product densities resulting from the augmented 
basis functions, i.e. they do not contain suitable functions to span the product densities 
resulting from the added new basis functions. 

According to published data one may conclude that the RI-X bases have the same qu ality 



as aCD/acCD with a Cholesky decomposition threshold in the range 10~ 4 to 10 -6 28|, l60 |. 
Typically, the Cholesky basis sets are somewhat larger than the corresponding RI-C ones 
and , therefore, require a higher computational cost which is a reasonable price to pay for 
an unbiased and highly accurate auxiliary basis set. 

B. Green function method 

Green's function or propagator theory is well established and we just briefly describe it 
also mentioning some computational details. Most of one-particle propagator methods are 



based on the well-known Dyson equation 
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6l|: 



G(w) = Go(w) + Go(u)S( W )GH, (19) 

where G(u) is the one-particle Green's function, S(co>) is an energy dependent non-local 
potential called self-energy and Go(cj) is the free Green's function. Formally, the ionization 
potentials (IPs) and electron affinities (EAs) of the system under consideration are poles of 
G(w): 

G(w) = (w • l-e-S(w)) -1 , (20) 



S 



where e is the diagonal matrix of the canonical HF one-electron energies and 1 is a unit 
matrix. For convenience, the poles of G(u) can be found from 

det(wl-e-S(w))=0. (21) 

The self-energy S(w) itself also possesses a spectral representation [14] and this can be 
used to solve Eq. ( )2T|) by solving an eigenvalue equation. If the elements of Sfo;) are explicitly 
known, the solution of the Dyson equation becomes particularly compact |62j ■ In this work 
we resort for simplicity of representation to the so-called quasi-particle approximation, in 
which the self-energy is diagonal and Eq. (J2"T|) reduces to: 



u = e q + E qq (uj) , (22) 
where q is the MO index related to the IP or to the EA we are looking for. In practice 



(23) 



Eq. (122]) is solved iteratively via the Newton-Raphson method 

^ n+1) = < + (e q + E W ( W W) - W W)PW 
where P q is the so-called pole strength: 



p(n) 



dco 



,00 



-1 



< P q < 1 . 



(24) 



(o) 



The iterative procedure usually starts from a HF one-electron energy of the gth MO (u, 
e q ) and continues until the absolute difference between previous and current values of a pole 



is smaller than some given threshold, say A = 10 



(n+l) 



(n) | < A. 



(25) 



There are several successful approximations to S(w) in the literature |l5l.l63H68|. Here, for 
simplicity of presentation we employ the well-known expression of second-order perturbation 
theory. In the second order, the gth diagonal element of the self-energy reads 

2(qi\aj) - (qj\ai))(qi\aj) ^ (2(qa\ib) - (qb\ia)) (qa\ib) 



4? M 



E 

a 



CO + e a - Ei - e. 



i 

a,b 



LO + Ei — E a — Ef, 



(26) 



where a,b denote virtual, and i,j occupied spatial HF orbitals. The second order contains 
the most relevant relaxation and correlation corrections to Koopman's theorem 69| and self- 
energy provides convenient checks of new computer codes and the computational experience 
necessary to implement more general approximations. 



III. COMPUTATIONAL DETAILS 



All calculations presented here are on Ceo which is an ideally suited object as it contains 
many atoms and is of general interest documented by numerous investigations. All calcula- 
tions of ground and ionic states of Cqq were performed in D 2 h symmetry at the experimental 



gas-phase geometry: Rqc = 1-458 A and Rcc = 1-401 A [70J. The AO basis sets used in 
this work are Dunning's cc-pVXZ (X=D,T) basis sets 71| and the respective total number 
of basis function are 840 and 1800. Throughout, the spherical representation of the d- and 



f-basis functions was used. The ca' 
quantum chemistry prog ram 



43 



the TURBOMOLE 



73 



culations with CD were performed by using the MOLCAS 
The DF/RI MP2 calculations were carried out using 



]|? 4751 ] quantum chemistry package with suitable RI-C and RI-JK 
auxiliary basis sets corresponding to the original set AOs [57J. Both CD and DF/RI calcula- 
tions were done on Intel® Xeon® E5440 (2.83GHz) and AMD Opteron™ 2220 (2.80GHz) 



based supercomputers 



76 



771 ] . Fully direct HF and MP2 calculations were carried out with 



the PC GAMES S /Firefly program suite [78 



791 1 . For open shell calculations at the HF level 

n 

the restricted open-shell approach (ROHF) has been employed 80] . 

For calculating IPs and EAs in the quasiparticle approximation with T} qq we developed 
a parallel program called P— RICDE. As input data the P— RICDE uses integral tables in 
AO representation and SCF MO LCAO coefficients which are generated with the MOLCAS 
program. Execution of P— RICDE consists in two separate steps: parallel transformation 
of integral tables from the AO to the MO representation by Eq. (jl]) and iterative solving 
Eq. (|2"3"|) . The ERIs needed during the iterative solution are recomputed in parallel by 
formula (fT3"|) . More details about the structure and parallelization of P— RICDE will be the 
subject of a forthcoming manuscript. 

All electron propagator calculations in the present paper were fully correlated, i.e. all 

(2) 

electrons were taken into account at the T} qq level of theory. For the energy conversion of 
units the factor 1 hartree (Eh) =27.211396 eV was used. 

IV. RESULTS AND DISCUSSION 



First, we would like to introduce the abbreviations used in this chapter. CD-n refers to 
the full-CD decomposition with threshold 5 = 10~ n . aCD-n* and acCD-n* mean "atomic 
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Cholesky" basis and its compact form, respectively. aCD-n or acCD-n basis sets have been 
formed from the original aCD-n*/ acCD-n* ones by removing the highest angular momentum 
orbitals. The term " low Cholesky" stands for the full-CD results obtained with a decom- 
position threshold 5 in the range from 10~ 4 to 10 -6 , and "medium Cholesky" and "high 
Cholesky" are for thresholds in the range 10~ 6 to and 10~ 8 to 10~ 10 , respectively. 



A. Ground state (GS) 

We would like to start the discussion with MP2 ground state energies because the expres- 
sions for MP2, Eq. f lTHj) . and Yy qq , Eq. ( 126]) . are quite similar. For the sake of convenience 
and clarity, we decompose the MP2 total energy into a sum of two contributions 

E Total — E S CF + E M P2 C , (27) 

where E$cf and E MP2c are the HF energy and the MP2 correlation energy, respectively. 
With the decomposition given in Eq. fl27|) . the total MP2 energy error takes on the form 



CMP2 — £HF + £MP2 C , (28) 

where e x = \E^ pprox - Ef Tect \ for X = HF, MP2 C . 

Figure 1 shows the total MP2 energy error euP2 and its contributions €hf and €mp2 c - 
First of all, Figure 1 clearly represents the accuracy of approximations used: the high 
Cholesky results approach the exact one, DF/RI and the low Cholesky results are the least 
accurate and atomic CD results are between them. Within the atomic CD series, one 
can see that the aCD*/acCD* results are on the high-accuracy side, whereas its optimized 
(reduced) aCD/acCD version are on the lower-accuracy side. We should stress that the 



50] 



observed tendencies are in full agreement with previously published data 

The next important issue concerns the convergence rates of the energies of the full-CD 
approximation. From Figure 1, it becomes evident that the main source of error here comes 
from the HF method. We now analyze this trend in more detail. Table I reports the 
total MP2 energies calculated by using full-CD and DF/RI approximations. By comparing 
results from the 3-rd and 4-th columns or from the 6-th and 7-th columns with the reference 
numbers, we conclude that the correlation energy (MP2q) converges much faster than the 
corresponding HF energy (SCF). We attribute this behaviour to the fact that the HF total 
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energy is an expectation value, while the MP2 correlation energy is a correction quantity. 
Using CD or DF/RI leads to loss in accuracy of the computed ERIs and it is well-known 
that computing corrections is numerically more robust with respect to the precision of input 
data, than computing expectation values due to cancellation of errors. 

In view of the results of this chapter we expect propagator methods to be robust with 
respect to Cholesky decomposition thresholds. 



B. Cationic and anionic states 



The GS of Ceo is 1 A ig . Removing a single electron yields an 2 H U cationic ground state, 
while the attachment of an extra electron yields a 2 Ti„ anionic ground state. The anion Cg 
is known to be bound by 2.68 eV in the gas phase [811 ] . The first adiabatic ionization energy 
of Ceo has been estimated to be 7.64 eV 82l485| . 



Table II lists the results for the first vertical IP and first EA obtained by two uncorrelated 
approximations: by applying Koopmans Theorem (KT) 86j] and by the Delta-SCF (ASCF) 
method j§7]. KT results are obtained from the HF calculations on the neutral fullerene. 
The IP is obtained from the energy of the HOMO and the EA - from the energy of the 
LUMO. ASCF refers to the difference between the total HF energy of neutral and that of 
its ions. From Table II, we can see that the calculated quantities depend only slightly on 
the decomposition threshold. The maximal error introduced by the CD approximation is 1 
meV or less. In contrast to the HF total energy (see Table I), the ASCF results converge 
much faster. For a threshold of 5=10~ 5 or less, the error in the computed IP and EA is 
negligible. This finding clearly verifies the predicted robustness of relative quantities to the 
decomposition threshold. 

In Table III and Table IV, we present main results of the present paper. These tables 
report the calculated first IP and EA employing various Cholesky decompositions and va- 

(2) 

lence basis sets at the Eg 9 level. As one would expect, the computed quantities are not very 
sensitive to the Cholesky decomposition threshold. Even the low Cholesky results have a 
very encouraging level of accuracy of 1 meV. Starting from <5=10 -6 , the medium Cholesky 
results approximate the exact ones (£=1CT 10 ) fairly well. Clearly, the full Cholesky basis 
sets provide superior convergence in calculating first IPs and EAs. 

Interestingly, the atomic CD results resemble the full-CD ones. From Table III and IV, 
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it is clearly seen that here there is no difference in the results obtained using original atomic 
Cholesky basis or its compact form (aCD-n vs acCD-n or aCD-n* vs acCD-n*). In the case 
of atomic CD, the most pronounced changes occur by removing higher orbital products, i.e. 
in going from aCD-n* to aCD-n or from acCD-n* to acCD-n Cholesky basis sets. Within 
aCD-n/acCD-n Cholesky basis sets, the maximal error in the first IP/EA potential is 2 meV. 

In order to investigate in more detail the influence of Cholesky basis sets on the calculated 
IP and EA, we extend the energy window of the calculated potentials. Now, all canonical 
HF orbitals lying in the energy range from -14.272 to 4.088 eV in the cc-pVDZ basis set, and 
from -14.303 to 9.903 eV in the cc-pVTZ basis set, are taken into account, and only poles 
of the propagator which are related to quasiparticles are considered in the subsequent error 
analysis. Before discussing further, we would like to identify two issues: the reference results 
and the precision of the statistical results. In the error analysis of the computed spectrums 
the sgVCD-10 results are used as reference. Since we use A = 10 5 as the convergence 
threshold in the Yy qc j iterative procedure ( l25l) . all computed statistical characteristics which 
are below the given threshold should be considered zero. The reference spectra (IPs, EAs 
and their pole strengths as a function of energy) of Ceo obtained with the cc-pVDZ and 



cc-pVTZ valence basis sets are shown in Figure 2. While calcu 
available for the IPs even beyond second order self-energies 



ations using propagators are 
, no ab initio calculations on 



the EAs of Ceo have been reported so far because of the large computational effort involved. 
Experimental IPs and EAs are available in literature, see e.g. 8ll435j]. In Figure 2, we see 
that enlarging the basis set shifts the EAs and IPs to lower energies by an amount which 
is approximately constant for each of these groups of quantities. The second order under- 
stimates the first IP and inclusion of higher order corrections is essential for a quantitative 
prediction (see OVGF and ADC(3) results in Table II). In this work, we are not concerned 
with the absolute quality of the self-energy used. We are rather concerned with the accuracy 
of the calculations using a given self-energy and the CD technique. In addition, utilizing the 
CD enables us to attack larger molecules which cannot computed otherwise. 

We now turn to the error analysis of the various CD approximations used. The data 
shown in Figure 3 and Figure 4 confirm the observed feature of the full Cholesky basis sets 
discussed above. As in the case of the first IP and EA, the full-CD spectral results for 
5 = 10~ 6 or less are exact within the prescribed accuracy limit (10 /j,E^ or 0.27 meV). From 
the figures, it is also evident that the full-CD results depend on the decomposition threshold, 
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but they are only slightly affected by the quality of the AO basis set used. In other words, 
full-CD basis sets introduce nearly uniform spectral errors which only slightly depend on the 
AO basis set used. In particular, the RMAX (RMS) value for CD-4 are 265 (81) and 262 
(67) //Eh in the cc-pVDZ and cc-pVTZ basis sets, respectively. Consequently, low Cholesky 
results have a reasonable accuracy of a few meV, a finding which is of practical relevance. 

In contrast to the full-CD results, the atomic CD ones are sensitive to the quality of 
the AO basis set used. By going from double- to triple-zeta-quality AO basis sets, we 
substantially decrease the overall error (RMAX and RMS are reduced by factor of 7) of 
the spectra computed using aCD-n/acCD-n (n=4,6) Cholesky basis sets. In the case of 
aCD-n*/acCD-n* Cholesky basis sets the improvements are not so pronounced. In the triple- 
zeta basis set, the aCD-n*/acCD-n* (n=4,6) results are only 1.07 to 3.63 times better than 
in the double-zeta set. For cc-pVTZ AO, the aCD-4*/acCD-4* and aCD-6/acCD-6 Cholesky 
basis sets provide sub-milli-electron-volt accuracy, the corresponding value of RMAX and 
RMS are about 18 and 5 //E h , respectively. We note that the aCD-6*/acCD-6* results in 
both AO basis sets used coincide with the reference ones. 

Interestingly, the aCD-4/acCD-4 results in the cc-pVTZ AO basis set have milli-electron- 
volt accuracy, which is much better than the accuracy of the CD-4. For instance, the RMAX 
(RMS) value in the cc-pVTZ basis set for aCD-4/acCD-4 and CD-4 are about 46 (16) and 
262 (67) //E h , respectively. 

In order to characterize the origin of errors in the spectra due to CD, we have made 
a linear regression analysis between the reference results (CD-10) and the others. The 
resulting correlation coefficients are 1.00000. Figure 5 displays a typical picture resulting 
from the regression analysis. Obviously, the error due to CD is systematic. Therefore, the 
CD approach leads essentially to a uniform shift of the whole spectrum. Even in the worst 
case (CD-4) addressed in Figure SI in the supplementary material, this shift amounts only 
to 1 meV. 

C. Efficiency 

In order to demonstrate the computational power of the developed P— RICDS program 
we would like to provide some timings. As is well-known one of the main bottlenecks of 
quantum chemistry is the transformation of ERIs from AO to MO representation. For the 
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P— RICDE program running on 150 cores (50 nodes x 3 cores) the typical timings of the 
AO to MO transformation in the cc-pVTZ valence basis set range from 20s to 100s within 
the full-CD series and from 9s to 25s in case for the aCD ones. By using the same number 
of cores the wall time needed to pass one NR iteration f l2"5"j) is between 15s and 25s. Recall 
that the total number of basis functions in cc-pVTZ for Cgo is 1800. 

Another important point is the timing for computing integral tables via the CD technique. 
Figure 6 depicts the time needed to complete CD in various Cholesky and valence basis sets. 
As one can see, the aCD computational scheme is at least one order of magnitude faster 
than the full-CD ones. In particular, the CD-10 is about 100 times slower than aCD-4. 

V. CONCLUSIONS 

In the present work we demonstrate the robustness of the one-particle electron propaga- 
tor method with respect to Cholesky decomposition schemes for two-electron integrals. All 
decomposition schemes reported in the literature are used. We found that even for mod- 
erate Cholesky decomposition thresholds (> 10 -5 ) the maximal error in computed electron 
affinities and ionization potentials is rather small (~1 meV) and is typically several orders of 
magnitude smaller than the error arising from the incompleteness of the AO basis sets used. 
The full Cholesky decomposition exhibits excellent convergence properties with respect to 
the decomposition threshold. For electron propagator methods there is no need to use small 
(< 10 -7 ) thresholds. The atomic Cholesky basis sets speed up the calculations by several 
orders of magnitude without leading to a significant loss in accuracy. In particular, we con- 
clude that acCD-n and acCD-n* (n=4,6) Cholesky basis sets provide optimal compromise 
between performance and accuracy. 

The error introduced by the Cholesky decomposition has a systematic behavior. Varying 
the decomposition threshold leads to a nearly uniform shift of the energy of the whole 
spectrum, i.e. all calculated poles are shifted by about the same value. 

We want to stress that the results presented could be obtained in a reasonable time 
only because of the efficient parallel algorithm employed and by utilizing a massive parallel 
computer. By using the Cholesky decomposition technique and parallel computing one is 
now able to perform large-scale electron propagator calculations, which were impossible 
before via conventional techniques. This opens up wider perspectives in modelling large 
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molecular systems. 
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FIGURE CAPTIONS 



Figure 1: (Color online) Ground state results on Cqo. Total MP2 energy errors and their 
contributions employing different Cholesky basis sets for cc-pVTZ valence basis set. All 
errors computed as absolute deviation from the corresponding results of MP2(FC)/cc-pVTZ 
fully-direct calculation. Note that the errors are given in mEh while the inset is in //Eh. 
The basis set contains 1800 functions. 

(2) 

Figure 2: (Color online) The E^/CD-IO electron spectra of C 60 computed in the 
cc-pVDZ and cc-pVTZ valence basis sets. EAs and IPs are shown on the left and right 
hand sides, respectively. 

Figure 3: (Color online) Maximal absolute error of the computed E^ electron spectra 
of Ceo employing different Cholesky basis sets for cc-pVDZ and cc-pVTZ valence basis sets. 
All errors are computed relative to the E^/CD-10 spectra. The dashed horizontal line 
displays the predefined level of accuracy (10 //E h ). 

Figure 4: (Color online) RMS error of the computed E^ electron spectra of C 60 
employing different Cholesky basis sets for cc-pVDZ and cc-pVTZ valence basis sets. All 

(2) 

errors are computed relative to the E^/CD-IO spectra. The dashed horizontal line displays 
the predefined level of accuracy (10 //E h ). 

Figure 5: Correlation between E^/CD-10 and E^ /acCD-4 results for C 60 in cc-pVTZ 
valence basis set. In parenthesis is displayed the correlation coefficient (R). 

Figure 6: (Color online) The relative timings (t) of the Cholesky decomposition 
performed within different Cholesky and valence basis sets. The aCD-4 timings (t a cD-4) 
are used as reference. For 150 cores (50 nodes x 3 cores) those timings are 6.8 and 106 
seconds for the cc-pVDZ and cc-pVTZ valence basis sets, respectively. 

Figure SI: Correlation between nJJ/CD-10 and eSJ/CD-4 results for C 60 in cc-pVTZ 
valence basis set. In parenthesis is displayed the correlation coefficient (R). 
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TABLE I: Total MP2 energies {Eh) for the ground state of Ceo and their decomposition into SCF 
(HF) and correlation (MP2c) energies. Listed are the results obtained for different DF/RI and 
full-CD decompositions for two basis sets cc-pVDZ and cc-pVTZ (840 and 1800 basis functions, 
repsectively) . These results are compared with the "exact" ones. 



Method 






cc-pVDZ 






cc-pVTZ 






















SCF 


MP2 C 


MP2 


SCF 


MP2 C 


MP 2 


RI 


C 


-2271.947700 


-7.741105 


-2279.688805 


-2272.396894 


-9.250503 


-2281.647397 




JK-C 








-2272.395429 


-9.249306 


-2281.644736 




4 


-2271.932301 


-7.736309 


-2279.668610 


-2272.390452 


-9.249999 


-2281.640451 




5 


-2271.946702 


-7.741485 


-2279.688187 


-2272.396589 


-9.251553 


-2281.648142 




6 


-2271.947519 


-7.741778 


-2279.689297 


-2272.396760 


-9.251745 


-2281.648505 


CD 


7 


-2271.947671 


-7.741839 


-2279.689510 


-2272.396871 


-9.251791 


-2281.648661 




8 


-2271.947697 


-7.741858 


-2279.689556 


-2272.396891 


-9.251797 


-2281.648689 




9 


-2271.947699 


-7.741857 


-2279.689556 


-2272.396894 


-9.251799 


-2281.648693 




10 


-2271.947700 


-7.741857 


-2279.689557 


-2272.396894 


-9.251798 


-2281.648692 


Direct 




-2271.947700 


-7.741856 


-2279.689556 


-2272.396894 


-9.251798 


-2281.648692 
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TABLE II: Vertical IP and EA of C 60 at the HF level of theory (eV). The last significant digit 
underlined here (in comparison with the corresponding results of the direct calculations). 

IP EA 

Method 

KT ASCF KT ASCF 



cc-pVDZ 



CD-4 


7.809 


7.504 


0.767 


0.993 


CD-5 


7.810 


7.505 


0.768 


0.994 


CD-6 


7.810 


7.505 


0.768 


0.994 


CD-7 


7.810 


7.505 


0.768 


0.994 


CD-8 


7.810 


7.505 


0.768 


0.994 


CD-9 


7.810 


7.505 


0.768 


0.994 


CD-10 


7.810 


7.505 


0.768 


0.994 


Direct 


7.810 


7.505 


0.768 


0.994 






cc- 


pVTZ 




CD-4 


7.797 


7.459 


0.805 


1.059 


CD-5 


7.798 


7.460 


0.807 


1.061 


CD-6 


7.798 


7.460 


0.807 


1.061 


CD-7 


7.798 


7.460 


0.807 


1.061 


CD-8 


7.798 


7.460 


0.807 


1.061 


CD-9 


7.798 


7.460 


0.807 


1.061 


CD-10 


7.798 


7.460 


0.807 


1.061 


Direct 


7.798 


7.460 


0.807 


1.061 



OVGF 7.65° 
ADC(3) 7.68° 
Exp. 7.64 6 2.68 c 



"Reference 88. 
^Reference 82-85. 
c References 81. 
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TABLE III: Vertical IP of C 60 at the T, qq ' level of theory (eV). The pole strengths are given in 
parenthesis. The last significant digit is underlined here (as a reference the CD-10 results are used). 

-log (S) 

Method 

4 5 6 7 8 9 10 

cc-pVDZ (0.802) 

CD 6.947 6.947 6.948 6.948 6.948 6.948 6.948 

aCD* 6.947 6.947 6.947 6.948 6.948 6.948 6.948 

acCD* 6.947 6.947 6.947 6.948 6.948 6.948 6.948 

aCD 6.946 6.947 6.948 6.948 6.948 6.948 6.948 
acCD 6.946 6.947 6.948 6.948 6.948 6.948 6.948 
cc-pVTZ (0.793) 
CD 7.118 7.118 7.118 7.118 7.118 7.118 7.118 

aCD* 7.119 7.118 7.118 7.118 7.118 

_a 

acCD* 7.119 7.118 7.118 7.118 7.118 

aCD 7.119 7.119 7.119 7.119 7.119 7.119 7.119 
acCD 7.119 7.119 7.119 7.119 7.119 7.119 7.119 



"Calculation failed due to some internal restriction of MOLCAS. 
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TABLE IV: Vertical EA of C 60 at the level of theory (eV). The pole strengths are given m 
parenthesis. The last significant digit is underlined here (as a reference the CD-10 results are used). 



-log (S) 

Method 

4 5 6 7 8 9 10 

cc-pVDZ (0.819) 

CD 2.753 2.754 2.754 2.754 2.754 2.754 2.754 

aCD* 2.753 2.754 2.754 2.754 2.754 2.754 2.754 
acCD* 2.753 2.754 2.754 2.754 2.754 2.754 2.754 

aCD 2.752 2.753 2.755 2.755 2.755 2.754 2.754 
acCD 2.752 2.753 2.754 2.755 2.755 2.754 2.754 
cc-pVTZ (0.815) 
CD 3.110 3.111 3.111 3.111 3.111 3.111 3.111 

aCD* 3.111 3.111 3.111 3.111 3.111 
acCD* 3.111 3.111 3.111 3.111 3.111 

aCD 3.111 3.111 3.111 3.111 3.111 3.111 3.111 
acCD 3.111 3.111 3.111 3.111 3.111 3.111 3.111 



"Calculation failed due to some internal restriction of MOLCAS. 
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